We present the results of a search for new particles that lead to a Z boson plus jets in $p\bar{p}$
collisions at $\sqrt{s} = 1.96$ TeV using the Collider Detector at Fermilab (CDF II). A data sample with a
luminosity of 1.06 fb$^{-1}$ collected using Z boson decays to $ee$ and $\mu\mu$ is used. We describe a completely
data-based method to predict the dominant background from standard-model Z+jet events. This
method can be similarly applied to other analyses requiring background predictions in multi-jet
environments, as shown when validating the method by predicting the background from W+jets in
$t\bar{t}$ production. No significant excess above the background prediction is observed, and a limit is set
using a fourth generation quark model to quantify the acceptance. Assuming BR($b' \to bZ$) = 100%
and using a leading-order calculation of the b' cross section, b' quark masses below 268 GeV/$c^2$ are
excluded at 95% confidence level.
I. INTRODUCTION

This paper presents a search for new particles decaying
to Z gauge bosons created in $p\bar{p}$ collisions at
$\sqrt{s} = 1.96$ TeV with the CDF II detector at the Fermilab
Tevatron, extending and complementing other work
with such final states [1, 2, 3, 4]. A variety of extensions
to the standard model predict new particles with couplings
to Z bosons [5, 6, 7, 8, 9]. We wish to discover or
rule out these types of models, while maintaining model 
independence in the search. That is, while these theo- 
ries offer guidance about the possible characteristics of 
physics beyond the standard model, they do not neces- 
sarily correspond to what actually exists in nature, and 
so the analysis is not tailored to specific models. 

Of course, some assumptions are necessary in choosing 
how to discriminate between the standard model back- 
ground and new signals. We examine final states with 
Z bosons and additional jets. In particular, we focus 
on final states in which there are at least 3 jets, each
with at least 30 GeV of transverse energy $E_T$. This assumption
was motivated by studying the optimal kinematic
selection of a specific model, the fourth generation
model [5]. In the fourth generation model, an additional
pair of heavy quarks is added to the standard model's
three. The production mechanisms of the new down-type
quark (called the b') would be identical to those of
the top quark, with pair production having the largest
cross section. Depending on its mass, the direct tree-level
decays of the b' could be either kinematically forbidden
or heavily Cabibbo-suppressed. These situations could
give rise to a large branching ratio of $b' \to bZ$ via a loop
diagram. While the selection was chosen as the optimal
set of kinematic cuts using this model as a signal, this
analysis constrains all models with Z+3 jet final states.

The dominant background for this final state is from 
standard model Z production with jets from higher order 
QCD processes. A leading order calculation of this back- 
ground is insufficient. Use of higher order calculations 
is complicated because it involves hard-scattering matrix 
elements in combination with soft non-perturbative QCD 
processes. Recent NLO predictions [10] have been used
[11] with the aid of Monte Carlo simulations to account
for the non-perturbative overlap. Any such method re- 
quires validation with data. In this paper, we develop a 
different approach that uses the data as more than a val- 
idation tool, and uses it alone for the background estima- 
tion. In this approach, we extrapolate the jet transverse 
energy distributions from a low energy control region of 
the data into the high energy signal region. 



This paper is organized as follows. Section II contains
a brief overview of the portions of the CDF II detector
relevant to this measurement. Section III lists the trigger
requirements, and describes and motivates the signal
sample selections. Section IV lists the backgrounds.
Section V describes, validates, and applies the method of
predicting the dominant background. In Sec. VI the
predictions for the remaining backgrounds are described. In



Sec. VII we present the results of the search, and con- 
II. THE CDF II DETECTOR

The CDF II detector is described in detail elsewhere
[12]; here, only the portions required for this analysis are
described. We first describe the coordinate system conventions.
In the CDF coordinate system, the origin is the
center of the detector, and the z axis is along the beam
axis, with positive z defined as the proton beam direction.
The x axis points radially outward from the Tevatron
ring, leaving the y axis direction perpendicular to the
earth's surface with positive direction upward. Spherical
coordinates are used where appropriate, in which $\theta$ is the
polar angle (zero in the positive z direction), $\phi$ is the
azimuthal angle (zero in the positive x direction), and the
pseudorapidity $\eta$ is defined by $\eta = -\ln[\tan(\theta/2)]$. At
hadron colliders, transverse energies and momenta are
usually the appropriate physical quantities, defined by
$E_T = E\sin\theta$ and $p_T = p\sin\theta$ (where $E$ is a particle's
energy and $p$ is the magnitude of a particle's momentum).
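
As a quick numerical illustration of these coordinate definitions, the following Python sketch evaluates the pseudorapidity and the transverse components from a polar angle. The function names are illustrative only and are not taken from any CDF software.

```python
import math


def pseudorapidity(theta):
    """eta = -ln[tan(theta/2)], with theta in radians and 0 < theta < pi."""
    return -math.log(math.tan(theta / 2.0))


def theta_from_eta(eta):
    """Inverse relation: theta = 2 * atan(exp(-eta))."""
    return 2.0 * math.atan(math.exp(-eta))


def transverse(magnitude, theta):
    """Transverse component: E_T = E sin(theta) or p_T = p sin(theta)."""
    return magnitude * math.sin(theta)
```

For example, a particle emitted perpendicular to the beam has pseudorapidity 0 and its full energy is transverse, while the tracker boundary at pseudorapidity 1 corresponds to a polar angle of about 40 degrees.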

A tracking system is situated directly outside the 
beam pipe and measures the trajectories and momenta 
of charged particles. The innermost part of the tracking 
system is the silicon detector, providing position mea- 
surements on up to 8 layers of sensors in the radial region 
1.3 < r < 28 cm and the polar region $|\eta| < 2.5$. Outside
of this detector lies the central outer tracker (COT),
an open-cell drift chamber providing measurements on
up to 96 layers in the radial region 40 < r < 137 cm
and the polar region $|\eta| < 1$. Directly outside of the
COT a solenoid provides a 1.4 T magnetic field, allow- 
ing particle momenta to be obtained from the trajectory 
measurements in this known field. 

Surrounding the tracking system, segmented electromagnetic
(EM) and hadronic calorimeters measure particle
energies. In the central region, the calorimeters
are arranged in a projective barrel geometry and cover
the polar region $|\eta| < 1.2$. In the forward region, the
calorimeters are arranged in a projective "end-plug" geometry
and cover the polar region $1.2 < |\eta| < 3.5$. Two
sets of drift chambers, one directly outside the hadronic
calorimeter and another outside additional steel shielding,
measure muon trajectories in the region $|\eta| < 0.6$;
another set of drift chambers similarly detects muons
in the region $0.6 < |\eta| < 1$. Muon scintillators surround
these drift chambers in the region $|\eta| < 1$ for trigger
purposes. A luminosity measurement is provided by
Cherenkov detectors in the region $3.7 < |\eta| < 4.7$ via a
measurement of the average number of $p\bar{p}$ collisions per
crossing [13].

Collision events of interest are selected for analysis of- 
fline using a three level trigger system, with each level 
accepting events for processing at the next level. At level 
1, custom hardware enables fast decisions using rudimen- 
tary tracking information and a simple counting of recon- 
structed objects. At level 2, trigger processors enable de- 
cisions based on partial event reconstruction. At level 3, 
a computer farm running fast event reconstruction soft- 
ware makes the final decision on event storage. 

III. DATA SAMPLE AND EVENT SELECTION 

We first describe the baseline Z selection, and then 
describe the kinematic selection used to discriminate the 
potential signal from the standard model background. 
The kinematic selection is chosen and backgrounds are 
predicted a priori, before looking in the signal region. 
While remaining as data-driven as possible throughout 
the analysis, Monte Carlo simulation is used in some 
studies, consistency checks, and for illustration purposes. 
In all cases, the Monte Carlo events are generated with
PYTHIA [14] and the detector responses are modeled with
a GEANT simulation [15] as described in [16].

A. Baseline Z Selection 

The data sample consists of $Z \to ee$ and $Z \to \mu\mu$
candidate events collected using single electron and muon
triggers. The electron trigger requires at least one central
electromagnetic energy cluster with $E_T > 18$ GeV and a
matching track with $p_T > 9$ GeV/$c$. The muon trigger
requires at least one central track with $p_T > 18$ GeV/$c$
with matching hits in the muon drift chambers. The
average integrated luminosity of these data samples is
1.06 fb$^{-1}$ [17].

Z candidate events are selected offline by requiring at
least one pair of electron or muon candidates both with
$p_T > 20$ GeV/$c$ and invariant mass in the range
$81 < M_{\ell\ell} < 101$ GeV/$c^2$. The electron and muon identification
variables are described in detail in Refs. [TH1 EE)- The 
selection is described briefly here. To increase efficiency, 
only one of the lepton pair has stringent identification 
requirements (the "tight" candidate) , while on the other 
lepton the identification requirements are relaxed (the 
"loose" candidate). 

"Loose" electron candidates consist of well-isolated 
EM calorimeter clusters with low energy in the hadronic 
calorimeter; in the central part of the detector ($|\eta| < 1.2$)
well-measured tracks from the COT are required; in the
forward parts of the detector ($|\eta| > 1.2$) no track is required,
but the shower shape in the EM calorimeter is required
to be consistent with that expected from electrons.




FIG. 1: Distribution of $M_{\ell\ell}$ of $Z \to ee$ and $Z \to \mu\mu$ data
(black points and errors) using the baseline Z selection described
in the text. Overlaid are standard model $Z \to ee$
and $Z \to \mu\mu$ Monte Carlo events, normalized to the number
of events expected with the given luminosities using the
expected cross section of 250 pb.

"Tight" electron candidates have all the requirements of 
"loose" candidates, and are additionally required to be 
central ($|\eta| < 1.2$), to have a shower shape consistent with
that expected from electrons, to have calorimeter position
and energy measurements consistent with their matching
track, and to have no nearby tracks consistent with
that expected in electrons from photon conversions. 

"Loose" muon candidates consist of well-measured 
tracks in the COT and well-isolated EM and hadronic 
calorimeter clusters with minimal energy deposits. 
"Tight" muon candidates have all the requirements of 
"loose" candidates, and are additionally required to have 
matching hits in the muon drift chambers. 

Finally, all electron and muon pairs are required to be 
consistent with originating from the same z vertex and 
to have a time-of-flight difference (as measured by the 
COT) inconsistent with that expected for cosmic rays. 
They are also required to be separated in $\phi$ by an angle
greater than 5° to remove two lepton candidates misreconstructed
from a single lepton.

Using this selection, the distribution of $M_{\ell\ell}$ is plotted
and compared to standard model Z Monte Carlo simulation
in Fig. 1.

B. Kinematic Selection 

The analysis focuses on topologies with large numbers 
of highly energetic jets in the final state, for which the 
signal (from the decay of heavy objects) can be better 
separated from standard model Z+jet production. Jets 
are clustered using the "midpoint" clustering algorithm 
P35] with a cone size of 0.4 radians. Corrections are applied
to extrapolate the jet energies back to the parton
level using a generic jet response [20]. Jets are required
to have $|\eta| < 2$.

The following discriminators are used: 

$N_{\rm jet}^{X}$ = number of jets in the event with $E_T > X$ GeV

$J_T^{X}$ = scalar sum of $E_T$ of jets in the event with $E_T > X$ GeV
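
The two discriminators can be written compactly in code. The sketch below assumes each event is represented simply as a list of corrected jet transverse energies in GeV; this representation and the function names are illustrative assumptions, not the analysis software.

```python
def n_jet(jet_ets, threshold):
    """N_jet^X: number of jets with E_T above `threshold` (GeV)."""
    return sum(1 for et in jet_ets if et > threshold)


def j_t(jet_ets, threshold):
    """J_T^X: scalar sum of E_T (GeV) over jets with E_T above `threshold`."""
    return sum(et for et in jet_ets if et > threshold)


# Hypothetical event with four jets (E_T in GeV).
example_jets = [85.0, 42.5, 31.0, 12.0]
```

For the hypothetical event above, a 30 GeV threshold gives three jets and a scalar sum of 158.5 GeV, while a 10 GeV threshold includes the softest jet and raises the sum to 170.5 GeV.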

The thresholds $X$ as well as the cut values on these variables
are determined by optimization [21]. In the optimization
we use the figure of merit $S/(1.5 + \sqrt{B})$ (where
$S$ is the expected number of signal events and $B$ is the
expected number of background events) to quantify the
sensitivity as a compromise between best discovery and
best limit potential [22, 23]. In the low background region
($B \ll 1$), maximizing this figure of merit is equivalent to
maximizing the signal efficiency. In the high background
region ($B \gg 1$), this figure of merit has the same behavior
as $S/\sqrt{B}$. For the optimization study, $p\bar{p} \to b'\bar{b}'$ Monte
Carlo simulations with a range of masses are used as the
signal $S$. Standard model Z Monte Carlo simulations are
used for the background $B$.
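
The figure of merit and its two limiting behaviors are easy to check numerically. The short sketch below uses the signal and background yields quoted in Table I for the 150 GeV/$c^2$ mass point; the function name is an illustrative choice.

```python
import math


def figure_of_merit(n_signal, n_background):
    """S / (1.5 + sqrt(B)) for expected signal S and background B."""
    return n_signal / (1.5 + math.sqrt(n_background))


# Yields quoted in Table I (events expected in 1 fb^-1).
fom_scan = figure_of_merit(48.5, 2.60)     # optimal point from the scan
fom_simple = figure_of_merit(75.5, 13.8)   # simple selection
```

Rounded to one decimal place these evaluate to 15.6 and 14.5, matching the last row of Table I. For $B = 0$ the expression reduces to $S/1.5$, and for very large $B$ it approaches $S/\sqrt{B}$.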

In order to be sensitive to a range of masses, we must 
take into account the generic behavior of new signals: 
as mass increases the cross section decreases while the 
transverse energy spectra become harder. Therefore, to 
be optimally sensitive to higher mass signals, we cut at 
larger values of $N_{\rm jet}$ and $J_T$, thus removing more of the
background to give sensitivity to the lower cross sections. 

For the sake of simplicity, we desire that our selection 
only changes gradually with mass and uses the same $E_T$
threshold on all jets. With a simple selection, the data-based
background prediction method becomes easier. To
confirm that this desire for simplicity does not considerably
reduce the search sensitivity, and to understand
what cut values and thresholds to use, we first establish
a "target" selection. The "target" selection is defined as
the selection with the highest sensitivity when placing
cuts on the individual jet $E_T$'s and $J_T$. This is found
by scanning through all possible cuts on $J_T^{10}$ (that is, $J_T$
is calculated with a 10 GeV threshold on the jets) and
all possible $E_T$ thresholds for up to 4 jets (ordered by
$E_T$), and finding the point with the optimal sensitivity.
In this scan, step sizes of 10 GeV are used for the jet
$E_T$ thresholds, and a step size of 50 GeV is used for $J_T^{10}$.
This scan is done independently for b' masses in the range
$100 < m_{b'} < 350$ GeV/$c^2$ with a step size of 50 GeV/$c^2$.

The optimal points found by this scan for a b' mass 
of 150 GeV/$c^2$ are shown in column 2 of Table I. These
cut values give the best possible sensitivity at this mass
point when placing cuts on the individual jet $E_T$'s and
$J_T^{10}$. Again, we wish to choose a simple selection that
gradually changes as a function of mass, and use the target
sensitivities at all mass points for comparison. Based
on the optimal target points for b' masses in the range
$100 < m_{b'} < 350$ GeV/$c^2$, we choose the simpler requirements
of $N_{\rm jet}^{30} \ge 3$ and $J_T^{10} > m_{b'}c^2$. The sensitivity of





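Variable                        Values from scan   Values of simple selection
Jet 1 $E_T$ thresh. (GeV):      50                 30
Jet 2 $E_T$ thresh. (GeV):      30                 30
Jet 3 $E_T$ thresh. (GeV):      30                 30
Jet 4 $E_T$ thresh. (GeV):      20                 -
$J_T$ cut (GeV):                -                  150
$N_{\rm sig}$:                  48.5               75.5
$N_{\rm bkg}$:                  2.60               13.8
$S/(1.5 + \sqrt{B})$:           15.6               14.5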

TABLE I: Optimal point compared with the simple selection
of $N_{\rm jet}^{30} \ge 3$ and $J_T^{10} > 150$ GeV, for the $m_{b'} = 150$ GeV/$c^2$ mass
point. Here, $N_{\rm sig}$ is the number of signal events expected in
1 fb$^{-1}$ after the given selection using b' Monte Carlo simulations.
$N_{\rm bkg}$ is the number of background events expected in
1 fb$^{-1}$ after the given selection using standard model Z Monte
Carlo simulations. In this optimization study, 2.7 x 10 stan- 
dard model Z events were used; 1500 signal events were used 
(both counted before jet selection). 



the simple requirements is compared to the target sensitivity
in column 3 of Table I for the 150 GeV/$c^2$ mass
point.

From the table it is apparent that, for $m_{b'} = 150$ GeV/$c^2$,
the sensitivity of the simple cuts is only
negligibly less than the target sensitivity. We find the
same to be true for all mass points studied, except for
the $m_{b'} = 100$ GeV/$c^2$ mass point. In that case, however,
the sensitivity of the simple cuts is still adequate because
of the larger cross sections for lower mass particles
[24]. In addition, low masses near 100 GeV/$c^2$ are less
interesting as they are already more tightly excluded [25].
Thus, we conclude that the simpler selection of $N_{\rm jet}^{30} \ge 3$
and $J_T^{10} > m_{b'}c^2$ is nearly optimal for the mass range of
interest.

In the above, $J_T$ was calculated using a 10 GeV $E_T$
threshold on the jets. For the purposes of the background
estimation, it is simpler to use the same $E_T$ threshold on
$J_T$ as one uses on the $N_{\rm jet}$ variable. Therefore, a 30 GeV
threshold is used when calculating $J_T$. This was found to
give a small decrease in sensitivity in the b' model with
the benefit of a gain in simplicity.

The kinematic jet selection was found to be optimal 
when using the fourth generation model as the signal. 
When optimizing using the figure of merit $S/(1.5 + \sqrt{B})$,
the optimal point is independent of the normalization
of the signal. That is, any model with a different cross
section but the same kinematic distributions will give
the same optimal point. In addition, the shapes of the
kinematic distributions are mostly determined by the b'
mass. We therefore expect that this selection is nearly
optimal for all models with heavy particles produced in
pairs and decaying to Z+jet. In general, this selection
is sensitive to any model with high $E_T$ jets in the final
state. It may not be optimal for an arbitrary model, but
designing a simple selection that is optimal for the entire
class of Z+high $E_T$ jet models is not possible.

In this optimization, we assumed new signals would 
lead to final states consisting of a Z boson and many high 
$E_T$ jets. Of course, some assumption about signal characteristics
must be made in order to understand how to
separate signal from background. These assumptions will 
naturally reduce the model independence of the search. 
There is a trade-off between the specificity of these as- 
sumptions and the sensitivity to a particular model. For 
example, in nearly all new physics models with Z bo- 
son final states, the transverse momentum spectrum of 
the Z is harder than for standard model Z production. 
This is because, in these models, the Z is usually a decay 
product of a massive particle. One would conclude that 
the Z transverse momentum is a very model-independent 
variable, and therefore well-motivated. However, we find, 
in the b' model sensitivity study, that the jet kinematic 
requirements have much higher sensitivity than the Z 
transverse momentum. The cost of this sensitivity is a 
loss of generality: with this assumption we are no longer 
sensitive to Z final states without high $E_T$ jets. The
sensitivity to the b' model can be further enhanced by
requiring b jets using displaced vertices (because of the
$b' \to bZ$ decay), again with a cost to generality. In our
analysis, as a compromise between model independence 
and sensitivity, we choose to require additional jets in the 
event. 

To summarize, after selecting $Z \to ee$ and $Z \to \mu\mu$
events, the kinematic selection is:

• $N_{\rm jet}^{30} \ge 3$, and

• $J_T^{30} > m_{b'}c^2$.

That is, Z events with $N_{\rm jet}^{30} \ge 3$ are selected, and the
$J_T^{30}$ distribution is scanned for an excess. Step sizes of 50
GeV are used.

IV. BACKGROUNDS 

In the signal region described above, there are poten- 
tial backgrounds from the following sources: 

• single-Z production in conjunction with jets,

• multi-jet events, where two jets fake leptons, 

• cosmic rays coincident with multi-jet events, 

• WZ+jets, where the W decays to jets, 

• ZZ+jets, where one of the Z's decays to jets, 

• WW+jets, where both W's decay to leptons, and 

• $t\bar{t}$+jets, where both W's decay to leptons.

The dominant background is from standard model
single-Z production in conjunction with jets. Since diagrams
beyond leading-log order make potentially large
contributions to events with $N_{\rm jet}^{30} \ge 3$, calculation of this
background from theoretical first principles is extremely
difficult, and therefore would require careful validation
with data. Rather than using data as merely a validation
tool we take a different approach, and instead measure
the background directly from data, and with data alone.
The following section is devoted to describing this prediction
technique for the dominant background from Z+jet.
As this technique has not been applied previously, it is
explained thoroughly, with careful validation studies described.
The remaining backgrounds are estimated in
Sec. VI.

V. DATA-BASED Z+JET BACKGROUND 
PREDICTION TECHNIQUE 

Given the above selection, there are two tasks: the 
total number of background events with $N_{\rm jet}^{30} \ge 3$ must
be predicted, and the shape of the $J_T^{30}$ distribution after
this cut must be predicted. When combined, these
two components give the full normalized $J_T^{30}$ distribution
prediction. The background for events with $N_{\rm jet}^{30} \ge 3$
and any $J_T^{30}$ cut can be obtained from this distribution.
The method for predicting each of the two components 
is described separately in the following two sections. 

In each of the prediction methods, fits to various jet $E_T$
distributions are used. A parameterization that describes
the shapes of these jet $E_T$ distributions well is therefore
required. The parameterization used is:

$$f(E_T) = p_0 \, \frac{e^{-E_T/p_1}}{(E_T)^{p_2}} \qquad (1)$$

where the $p_i$ are fitted parameters. This parameterization
was motivated by observations in Monte Carlo simulations,
control regions of data, and phenomenological
studies that: at low $E_T$, the jet $E_T$ shape follows a power
law function; at high $E_T$, it follows an exponential decay
function. The above parameterization satisfies these
limiting behaviors. With the above convention, the parameter
$p_1$ has dimensions of energy, the parameter $p_2$
is dimensionless, and both parameters are positive. Further
discussion and motivation for this parameterization
is provided in [IB] , 
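
The limiting behaviors of the parameterization can be made explicit with a short sketch. It assumes the form $f(E_T) = p_0\, e^{-E_T/p_1} / (E_T)^{p_2}$, where the overall normalization $p_0$ is an assumed convention; the logarithmic slope is $-1/p_1 - p_2/E_T$, so the power-law term dominates for $E_T \ll p_1 p_2$ and the exponential term for $E_T \gg p_1 p_2$. All numerical values used with it are hypothetical.

```python
import math


def jet_et_shape(et, p0, p1, p2):
    """f(E_T) = p0 * exp(-E_T / p1) / E_T**p2, with E_T and p1 in GeV."""
    return p0 * math.exp(-et / p1) / et ** p2


def log_slope(et, p1, p2):
    """d ln f / d E_T = -1/p1 - p2/E_T; independent of the normalization p0."""
    return -1.0 / p1 - p2 / et
```

With hypothetical values $p_1 = 20$ GeV and $p_2 = 2$, the slope at $E_T = 50$ GeV is $-0.09$ GeV$^{-1}$, and at very large $E_T$ it tends to $-1/p_1 = -0.05$ GeV$^{-1}$.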

A. Number of Events with $N_{\rm jet}^{30} \ge 3$

In order to predict the total number of events with
$N_{\rm jet}^{30} \ge 3$, we use the jet $E_T$ distributions in the
$N_{\rm jet}^{30} \le 2$ control regions. Since jets are counted above an $E_T$
threshold (in this case 30 GeV), the $N_{\rm jet}$ distribution
is completely determined from the jet $E_T$ distributions.
To illustrate this, and to describe the method, standard
model $Z \to \mu\mu$ Monte Carlo simulations are used. After
validation with control samples, the method is applied to
the Z data.
FIG. 2: $E_T$ distribution of the third highest $E_T$ jet in standard
model $Z \to \mu\mu$ Monte Carlo simulations. Events with
$N_{\rm jet}^{30} \le 2$ have $E_T < 30$ GeV; events with $N_{\rm jet}^{30} \ge 3$ have
$E_T > 30$ GeV.



FIG. 3: $E_T$ distribution of the third highest $E_T$ jet in standard
model $Z \to \mu\mu$ Monte Carlo events. The distribution is
fit to Eq. (1) in the range $15 < E_T < 30$ GeV, and extrapolated
to the $E_T > 30$ GeV region.



In Fig. 2 the $E_T$ distribution of the third highest jet
is shown. By construction, a cut on $N_{\rm jet}^{30} \le 2$ separates
this distribution into two regions. This distribution can
be fit in the $E_T < 30$ GeV region and extrapolated to
the $E_T > 30$ GeV region to get the expected number of
background events with $N_{\rm jet}^{30} \ge 3$.

We fit the parameterization from Eq. (1) to the jet $E_T$
distribution of Fig. 2, and show the results in Fig. 3 [26].
The fit matches well the broad features of the distribution
above 30 GeV. The number of events with $N_{\rm jet}^{30} \ge 3$ is
then predicted by integrating the fitted distribution from
30 GeV to infinity. The fit prediction obtained with this
method (with its uncertainty from fit parameter error
propagation described in Sec. V C I is 116!}!] events (with 
the number of generated Monte Carlo events having an 
equivalent luminosity of 7 fb$^{-1}$). The number of events
observed in the simulated data with $N_{\rm jet}^{30} \ge 3$ is 152.
In this case, the extrapolation predicts the background
to within $31 \pm 16$%. The level of consistency will be
evaluated further in the validation studies with data in
Sec. V D.
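
The extrapolation step, integrating a fitted shape from the threshold to infinity, can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the shape is taken to be $p_0\, e^{-E_T/p_1}/(E_T)^{p_2}$, the parameter values are hypothetical rather than fitted, and the integral is truncated at a large multiple of $p_1$ where the exponential makes the remainder negligible. Simpson's rule is simply a convenient choice here.

```python
import math


def jet_et_shape(et, p0, p1, p2):
    """Assumed shape: p0 * exp(-E_T / p1) / E_T**p2 (E_T and p1 in GeV)."""
    return p0 * math.exp(-et / p1) / et ** p2


def tail_integral(p0, p1, p2, threshold=30.0, n_steps=20000, n_scales=40.0):
    """Simpson's-rule integral of the shape from `threshold` to
    threshold + n_scales * p1; n_steps must be even."""
    upper = threshold + n_scales * p1
    h = (upper - threshold) / n_steps
    total = jet_et_shape(threshold, p0, p1, p2) + jet_et_shape(upper, p0, p1, p2)
    for i in range(1, n_steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * jet_et_shape(threshold + i * h, p0, p1, p2)
    return total * h / 3.0


# Relative difference between observed (152) and predicted (116) events.
relative_difference = (152 - 116) / 116
```

For $p_2 = 0$ the integral has the closed form $p_0\, p_1\, e^{-X/p_1}$ for threshold $X$, which provides a check of the numerical routine. The last line reproduces the quoted 31% difference between the simulated count and the central fit prediction.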

This method, using the jet $E_T$ distributions to predict
integrals of the $N_{\rm jet}$ distribution, can clearly be extended
to other analyses as well. For illustration purposes
only we describe other examples here, still using
standard model $Z \to \mu\mu$ Monte Carlo simulation. Consider
predicting the total number of events with $N_{\rm jet}^{80} \ge 1$
(that is, we require at least one jet with an $E_T$ threshold
of 80 GeV). In this case, a fit to the highest $E_T$ jet
distribution below 80 GeV can be extrapolated to above
that threshold, as in Fig. 4. (Note that the highest $E_T$ jet
distribution in this figure is harder than the third highest
$E_T$ jet distribution, as one expects when ordering the
jets by $E_T$.) It is clear that the extrapolation describes
the distribution reasonably well.

If we instead wish to predict the number of events with
$N_{\rm jet}^{40} \ge 1$, we must fit the same $E_T$ distribution below
40 GeV and extrapolate it to above that threshold, also
shown in Fig. 4. It is clear that the extrapolation does
not describe the high $E_T$ portion of the distribution well.
There is a large systematic uncertainty present in extrapolations
that use such a small portion of the distribution
that the shape cannot be reliably obtained. This can be
mitigated by raising the $E_T$ threshold, unless the shape of
the jet $E_T$ distribution at high $E_T$ can be otherwise constrained.
In the case examined in this analysis, we fit the
third highest $E_T$ jet (which has a softer $E_T$ distribution
than the highest $E_T$ jet) in the region $E_T < 30$ GeV. We
have checked that the data in this region constrain the
shape sufficiently with validation studies using control
samples of data and Monte Carlo simulations, described
later in Sec. V D.

From the above, it is apparent that one can estimate
the background for events with $N_{\rm jet}^{X} \ge n$ by fitting the
$E_T$ distribution of the $n$th highest $E_T$ jet in the region
$E_T < X$ and extrapolating the fit to the region $E_T > X$,
as long as the fit region $E_T < X$ constrains the shape
sufficiently.
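
The general statement rests on a simple equivalence: an event has at least $n$ jets above a threshold $X$ exactly when its $n$th highest-$E_T$ jet is above $X$. The sketch below checks this on hypothetical events, each represented as a list of jet $E_T$ values in GeV; treating events with fewer than $n$ jets as having an $n$th-jet $E_T$ of zero is an illustrative convention.

```python
def nth_highest_et(jet_ets, n):
    """E_T of the n-th highest-E_T jet, or 0.0 if the event has fewer than n jets."""
    ordered = sorted(jet_ets, reverse=True)
    return ordered[n - 1] if len(ordered) >= n else 0.0


def has_at_least_n_jets(jet_ets, n, threshold):
    """True if at least n jets have E_T above `threshold`."""
    return sum(1 for et in jet_ets if et > threshold) >= n


# Hypothetical events (jet E_T values in GeV).
example_events = [[85.0, 42.5, 31.0, 12.0], [45.0, 20.0], [], [30.0, 30.0, 30.0]]
```

Because the two conditions select the same events, the integral of the $n$th-jet $E_T$ distribution above $X$ equals the number of events with $N_{\rm jet}^{X} \ge n$.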



B. $J_T$ Shape Determination

We now describe the method used to determine the
shape of the $J_T^{30}$ distribution of events with $N_{\rm jet}^{30} \ge 3$. After
finding the shape, it is then normalized to the number
of events with $N_{\rm jet}^{30} \ge 3$ found by the above method. We
again use standard model $Z \to \mu\mu$ Monte Carlo events
to explain the method, and later will apply it to data.

Since $J_T^{30}$ is simply the sum of the individual jet transverse
energies above 30 GeV, if the $E_T$ distributions of
jets for events with $N_{\rm jet}^{30} \ge 3$ are known, the $J_T^{30}$ distribution
can be predicted for these events. We extrapolate
the shape of these jet $E_T$ distributions from the jet $E_T$
distributions of $N_{\rm jet}^{30} \le 2$ events. In order to do such an
extrapolation, we must understand the variation of the
jet $E_T$ distribution as a function of $N_{\rm jet}^{30}$.

FIG. 4: $E_T$ of the highest $E_T$ jet in standard model $Z \to \mu\mu$
Monte Carlo events. The distribution is fit to Eq. (1) in the
region $20 < E_T < 80$ GeV (dotted line), and again in the
region $20 < E_T < 40$ GeV (solid line).

FIG. 5: $E_T$ distribution of jets in $N_{\rm jet}^{30} = 1$ events (open
squares) and in $N_{\rm jet}^{30} = 2$ events (solid circles) in $Z \to \ell\ell$ data.
Events with higher $N_{\rm jet}^{30}$ have harder $E_T$ spectra.

The $E_T$ distributions of all jets in events with $N_{\rm jet}^{30} = 1$
and 2, normalized to have equal area, are shown in Fig. 5
using $Z \to \ell\ell$ data. The general shape is similar, though
jets in $N_{\rm jet}^{30} = 2$ events have a slightly harder tail at high
$E_T$. We model this by fitting to each jet $E_T$ distribution
(using Eq. (1)) and extrapolating the fit parameters to
$N_{\rm jet}^{30} \ge 3$ events. To avoid simultaneously extrapolating
two fit parameters we only extrapolate the exponential
parameter ($p_1$), as this parameter governs the high $E_T$
behavior in our parameterization. In order to extrapolate
only this parameter, we fit the $N_{\rm jet}^{30} = 1$ $E_T$ spectrum allowing
both parameters to float freely, then fix the power
law parameter ($p_2$) in the fit to the $N_{\rm jet}^{30} = 2$ $E_T$ spectrum.
We then extrapolate the $p_1$ parameter of Eq. (1)
linearly as a function of $N_{\rm jet}^{30}$, from its fitted values at
$N_{\rm jet}^{30} = 1$ and $N_{\rm jet}^{30} = 2$ into the region $N_{\rm jet}^{30} \ge 3$.
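
The linear extrapolation of the exponential parameter amounts to drawing a straight line through the two fitted points and evaluating it at higher jet multiplicity. A minimal sketch, with hypothetical fitted values in the usage note:

```python
def extrapolate_p1(p1_at_1, p1_at_2, n_jet):
    """Linear extrapolation of p1 versus N_jet through (1, p1_at_1) and (2, p1_at_2)."""
    slope = p1_at_2 - p1_at_1
    return p1_at_1 + slope * (n_jet - 1)
```

For example, if the fits gave $p_1 = 20$ GeV for one-jet events and $p_1 = 24$ GeV for two-jet events (illustrative numbers only), the extrapolated values would be 28 GeV for three jets and 32 GeV for four jets.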

Figures 6 and 7 show the fits of the spectra for events
with 1 and 2 jets. Figure 8 shows the linear extrapolation
of the exponential parameters. For illustration,
the exponential parameter obtained from a fit to the $E_T$
distribution in $N_{\rm jet}^{30} = 3$ events (again fixing the power
law parameter to that found in the $N_{\rm jet}^{30} = 1$ events) is
shown on the same figure. The extrapolation reasonably
predicts the parameter for events with $N_{\rm jet}^{30} = 3$ [27].

This dependence of the jet $E_T$ spectra on $N_{\rm jet}^{30}$ is modeled
as described by our parameter extrapolation, allowing
us to predict the shapes of the jet $E_T$ spectra for
events with $N_{\rm jet}^{30} \ge 3$. The $J_T^{30}$ distribution is now almost
completely determined. Only an estimate for the
FIG. 6: $E_T$ distribution of jets in $N_{\rm jet}^{30} = 1$ events in standard
model $Z \to \mu\mu$ Monte Carlo events. The distribution is fit to
Eq. (1) in the range $E_T > 30$ GeV.



relative fractions of events with 3, 4, 5, ... jets is needed.
For this, we use an exponential fit parameterization, fit
to the $N_{\rm jet}^{30}$ distribution in the region $N_{\rm jet}^{30} \le 2$, and use
this shape in the $N_{\rm jet}^{30} \ge 3$ region. This fit is shown in
Fig. 9. There is no theoretical motivation for an exponential
shape; we merely use it as an estimate, and verify
that the $J_T^{30}$ prediction does not strongly depend on the
chosen parameterization. As the total number of events
with $N_{\rm jet}^{30} \ge 3$ is already constrained using the method
from Sec. V A, the dependence of the $J_T^{30}$ distribution on
the exponential parameterization of the $N_{\rm jet}^{30}$ distribution
is small.

Finally, given the above shapes, it is straightforward to 
FIG. 7: $E_T$ distribution of jets in $N_{\rm jet}^{30} = 2$ events in standard
model $Z \to \mu\mu$ Monte Carlo events. The distribution is fit to
Eq. (1) in the range $E_T > 30$ GeV, with the parameter $p_2$ fixed to
that obtained from Fig. 6.

FIG. 8: The extrapolation of the exponential parameter $p_1$
vs. $N_{\rm jet}^{30}$ in standard model $Z \to \mu\mu$ Monte Carlo events.

FIG. 9: $N_{\rm jet}^{30}$ distribution in standard model $Z \to \mu\mu$ Monte
Carlo events, fit to an exponential in the range $N_{\rm jet}^{30} \le 2$. This
shape is used to estimate the relative fractions of events with
3, 4, 5, ... jets.



The process is repeated as necessary until the $J_T^{30}$ shape
is obtained to the desired level of statistical precision.

In step 2 the jet $E_T$ shapes are independently sampled;
however, there is potentially some correlation between
the individual jet energies. Including this correlation
in the $J_T^{30}$ shape prediction would have the effect
of making the tail at large values of $J_T^{30}$ slightly harder.
In the validation studies in Sec. V D we verify that the
correlation is below the level necessary to affect the fit
prediction. To understand this further, in Fig. 10 we
plot the $E_T$ of one of the jets versus the other in events
with $N_{\rm jet}^{30} = 2$ in the $Z \to \ell\ell$ data. There is no correlation
evident in the plot; in the 663 events with $N_{\rm jet}^{30} = 2$,
only a small correlation of 25% is found, indicating that
independently sampling the $E_T$ distribution is a reasonable
approximation.
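
A correlation check of this kind can be sketched as follows. The text does not state which correlation measure was used; the Pearson coefficient below is an assumption made for illustration. Each two-jet event is represented as a pair of jet $E_T$ values, and one jet is chosen at random as the first coordinate, mirroring the "random jet versus the other" construction, so that the $E_T$ ordering of the jets does not bias the result.

```python
import math
import random


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def random_jet_pairs(dijet_events, rng):
    """For each two-jet event, return (E_T of a randomly chosen jet, E_T of the other)."""
    pairs = []
    for first, second in dijet_events:
        if rng.random() < 0.5:
            first, second = second, first
        pairs.append((first, second))
    return pairs
```

A typical use would be `pairs = random_jet_pairs(events, random.Random(0))` followed by `pearson_correlation([a for a, _ in pairs], [b for _, b in pairs])`, where `events` is a hypothetical list of two-jet $E_T$ pairs.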



C. Uncertainties on Fit Prediction 



make a simple Monte Carlo program that samples these
shapes to get the $J_T^{30}$ distribution. The steps required to
make this $J_T^{30}$ prediction are:

1. For each event, generate the number of jets by randomly
sampling the predicted $N_{\rm jet}^{30}$ distribution in
the range {3, 4, 5, ...}.

2. Take the appropriate jet $E_T$ distribution for this
number of jets after extrapolating the exponential
fit parameter. Independently sample this jet $E_T$
distribution for each jet.

3. Sum the $E_T$ of these jets to obtain the $J_T^{30}$.
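
The three steps above can be sketched as a toy Monte Carlo. This is only an illustrative sketch, not the analysis code: the functional form used for Eq. (1), $f(E_T) \propto E_T^{-p_2} e^{-E_T/p_1}$, the parameter values, the linear $p_1$ extrapolation, and the 500 GeV upper cutoff are all assumptions made here for illustration.

```python
import math
import random

ET_MIN, ET_MAX = 30.0, 500.0  # GeV; the upper cutoff is an assumption of this sketch


def sample_njet(slope, rng, n_min=3):
    """Step 1: sample N_jet >= n_min from a shape proportional to exp(-slope * n)."""
    # An exponential in n restricted to {n_min, n_min + 1, ...} is a shifted
    # geometric distribution: keep adding a jet with probability exp(-slope).
    q = math.exp(-slope)
    n = n_min
    while rng.random() < q:
        n += 1
    return n


def sample_jet_et(p1, p2, rng):
    """Step 2: sample one jet E_T from the assumed shape E_T^-p2 * exp(-E_T / p1)
    on [ET_MIN, ET_MAX] by rejection sampling (the shape is largest at ET_MIN)."""
    f_max = ET_MIN ** -p2 * math.exp(-ET_MIN / p1)
    while True:
        et = rng.uniform(ET_MIN, ET_MAX)
        if rng.random() * f_max < et ** -p2 * math.exp(-et / p1):
            return et


def predict_jt(n_events, njet_slope, p1_of_njet, p2, seed=1):
    """Step 3: sum the independently sampled jet E_T values for each toy event."""
    rng = random.Random(seed)
    jt_values = []
    for _ in range(n_events):
        n_jet = sample_njet(njet_slope, rng)
        p1 = p1_of_njet(n_jet)  # extrapolated exponential parameter
        jt_values.append(sum(sample_jet_et(p1, p2, rng) for _ in range(n_jet)))
    return jt_values


# Hypothetical inputs: placeholder fit values and a linear p1 extrapolation.
jt = predict_jt(2000, njet_slope=1.5, p1_of_njet=lambda n: 20.0 + 5.0 * n, p2=2.0)
```

Histogramming `jt` and scaling it to the predicted number of events with $N_{\rm jet}^{30} \ge 3$ would give a normalized shape; generating more toy events reduces the statistical fluctuations of the shape.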



There are two sources of uncertainty on the mean background
prediction: the statistical uncertainty from the
finite amount of data in the fits, and the systematic
uncertainty from imperfect modeling of the various shapes
in the fits.



1. Statistical Uncertainty on Fit Prediction 

The third highest $E_T$ jet normalization fit predicts the
total number of events with $N_{\rm jet}^{30} \ge 3$, using the
parameter values at the minimum of $-\log L$, where $L$ is the
likelihood (equivalently, at the maximum likelihood). The
$1\sigma$ uncertainty on the number of events is simply obtained
from the values it takes where $-\log L$ exceeds its minimum by
$\frac{1}{2}$. Since the total number of events with
$N_{\rm jet}^{30} \ge 3$ is given by a single fit, its uncertainty
is easily determined with this method.

FIG. 10: The $E_T$ of a random jet vs. the $E_T$ of the other,
using jets with $N_{\rm jet}^{30} = 2$ in $Z \to \ell\ell$ data.

The $J_T^{30}$ prediction is obtained by extrapolating the
behavior of multiple distributions, and to estimate its
shape uncertainty we vary each fit parameter independently
within its uncertainty (output by the fit) and redo
the extrapolation procedure. The individual uncertainties
are combined in quadrature to obtain the total
uncertainty. The normalization error is then added in
quadrature as well to obtain the uncertainty on the
fully-normalized $J_T^{30}$ distribution.
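
As a small illustration of the combination described above, the sketch below adds the per-parameter shifts in quadrature and then adds the normalization error in quadrature as well. The function name and the numbers are hypothetical and serve only to show the arithmetic.

```python
import math


def total_uncertainty(shape_variations, normalization_error):
    """Combine per-parameter shifts and the normalization error in quadrature."""
    shape = math.sqrt(sum(d * d for d in shape_variations))
    return math.sqrt(shape ** 2 + normalization_error ** 2)


# Hypothetical shifts (in events) in one J_T bin from varying two fit parameters,
# together with a hypothetical normalization error.
combined = total_uncertainty([3.0, 4.0], normalization_error=12.0)
print(combined)  # 13.0
```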



Distribution                        Nominal range    "-1σ" range      "+1σ" range

Third highest $E_T$ jet             (15, 30) GeV     (15, 26) GeV     (17, 30) GeV
$N_{\rm jet}^{30} = 1$ jet $E_T$    (30, ∞) GeV      (30, 150) GeV    (70, ∞) GeV
$N_{\rm jet}^{30} = 2$ jet $E_T$    (30, ∞) GeV      (30, 80) GeV     (50, ∞) GeV
$N_{\rm jet}^{30}$ shape            [0, 2] jets      [0, 1] jets      [1, 2] jets



TABLE II: Nominal fit ranges and the fit range changes used 
to estimate systematic uncertainties. The nominal fit range 
of each distribution is shown in the second column. The third 
and fourth columns show the ranges used to estimate the 
uncertainty from a mis-parameterization of that distribution. 





FIG. 11: The prediction for the $J_T^{30}$ distribution (blue line)
of standard model Z Monte Carlo and its uncertainty (gray
band), compared to the actual distribution (black points with
errors).



2. Systematic Uncertainty on Fit Prediction 

As the background from Z+jet events is determined
from a fit to the data, the only source of systematic
uncertainty is mis-parameterization of those data. If
the data were poorly parameterized, fitting a subset of
the data would give a large change in the background
prediction. We therefore estimate the size of the
mis-parameterization uncertainties by changing the range of
each fit and redoing the fit procedure to obtain the
$J_T^{30}$ normalization and shape prediction. Both uncertainties,
that on the total number of events with $N_{\rm jet}^{30} \ge 3$
(from the third highest $E_T$ jet fit) and that on the
$J_T^{30}$ shape, are estimated in this way. The variations from
each fit range change are then added in quadrature to obtain the
full uncertainty. The fit range changes are summarized
in Table II. The "$\pm 1\sigma$" range changes are chosen to give
sufficient coverage when observed in control samples of
data.

Finally, using the technique and the uncertainties
developed above in the Monte Carlo simulation, we can
demonstrate that the method is self-consistent by checking
that the normalized $J_T^{30}$ prediction for events with
$N_{\rm jet}^{30} \ge 3$ matches that observed in Monte Carlo
events. This comparison is shown in Fig. 11. The observed
distribution agrees well with the prediction.



D. Validation of Technique 

Having demonstrated and described the procedure for
obtaining the Z+jet background using Monte Carlo simulation,
we now describe its validation, done predominantly in data.
The Z+jet data cannot be used as a validation sample because
of potential signal bias, so we must test on other data samples.
We use two sets of multi-jet data as background-only validation
samples, and W+jet data as a background sample containing a
real heavy quark signal from $t\bar{t}$ production. Finally, we
do signal-injection studies with Monte Carlo simulations
to understand the effect of signal bias on the fit procedure.
FIG. 12: $E_T$ distribution of the third highest $E_T$ jet in
"X"+jet events selected with the jet triggers as described in
the text. The distribution is fit to Eq. (1) in the
$15 < E_T < 30$ GeV region and extrapolated to the $E_T > 30$ GeV region.



FIG. 13: $E_T$ distribution of jets in $N_{\rm jet}^{30} = 1$ "X"+jet events,
selected with the jet triggers as described in the text. The
distribution is fit to Eq. (1) in the $E_T > 30$ GeV region.



1. Multi-Jet Data 

The Z+jet background extrapolation only requires
information about the jet $E_T$ distributions, and not the
Z. It should therefore perform similarly well not only
for Z+jet events, but also for "X"+jet events, provided that the
"X" has a transverse momentum spectrum similar to that of the
Z. For example, if the "X" has a minimum $p_T$ threshold,
the $E_T$ distributions of the jets will be sculpted such
that they no longer follow the power law × exponential
parameterization of Eq. (1).

We first obtain "X"+jet events from multi-jet data
dominated by QCD interactions using prescaled jet triggers
that require at least one jet with $E_T > 20$ GeV [28].
An "X" is then constructed by picking two random jets
in the event, requiring that they both have $E_T > 20$ GeV
(to match the electron and muon $p_T$ cuts), and requiring
$M_X > 70$ GeV/$c^2$ to remove the invariant mass turn-on.
The invariant mass is not further restricted to the region
$81 < M_X < 101$ GeV/$c^2$, in order to maximize statistics; in any
case the $J_T^{30}$ distribution is observed not to depend on
$M_X$ in this sample.

Given this "X" selection, the remaining jets in the
event are used to validate the procedure. Figure 12 shows
the third highest $E_T$ jet distribution. We extrapolate this
distribution above 30 GeV using Eq. (1). A prediction of
97 ± 27 (statistical uncertainty only) events with
$N_{\rm jet}^{30} \ge 3$ is obtained, and 80 events are observed,
which is consistent within the uncertainties. To quantitatively
evaluate the level of consistency we calculate the probability
to measure the observed number of events or higher given the
background prediction, and convert this probability
to units of standard deviations [29]. This calculation
gives a probability of 0.73, which is a $0.6\sigma$
level of consistency.
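
One way to carry out such a calculation is sketched below. It assumes, for illustration only, that the probability is the Poisson tail averaged over a Gaussian-distributed mean (truncated at zero) and that the conversion to standard deviations uses the one-sided Gaussian quantile; the exact prescription of Ref. [29] is not reproduced here, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def poisson_tail(n_obs, mu):
    """P(N >= n_obs) for a Poisson distribution with mean mu."""
    if mu <= 0.0:
        return 1.0 if n_obs <= 0 else 0.0
    below = sum(math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))
                for k in range(n_obs))
    return max(0.0, 1.0 - below)


def excess_probability(n_obs, pred, sigma, steps=2000):
    """Average the Poisson tail over a Gaussian-distributed mean, truncated
    at zero, using midpoint-rule integration over +-5 sigma."""
    lo, hi = max(0.0, pred - 5.0 * sigma), pred + 5.0 * sigma
    width = (hi - lo) / steps
    num = den = 0.0
    for i in range(steps):
        mu = lo + (i + 0.5) * width
        weight = math.exp(-0.5 * ((mu - pred) / sigma) ** 2)
        num += weight * poisson_tail(n_obs, mu)
        den += weight
    return num / den


# Numbers quoted in the text: 97 +- 27 events predicted, 80 observed.
p = excess_probability(80, 97.0, 27.0)
z = NormalDist().inv_cdf(1.0 - p)  # negative when the data lie below the prediction
print(f"p = {p:.2f}, z = {z:.1f} sigma")
```

The result can be compared with the probability of 0.73 and the $0.6\sigma$ level of consistency quoted above.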



We now predict the $J_T^{30}$ shape. Figures 13 and 14 show
the fits to the jet $E_T$ spectra for events with
$N_{\rm jet}^{30} = 1$ and 2. We extrapolate the parameter $p_1$
using the plot in Fig. 15 to events with $N_{\rm jet}^{30} \ge 3$.
The $N_{\rm jet}^{30}$ shape is taken from the fit in Fig. 16.
Using these ingredients, the simple Monte Carlo program is used
to obtain the $J_T^{30}$ shape, which is normalized to the
prediction of 97 events with $N_{\rm jet}^{30} \ge 3$. The
prediction and total uncertainty are shown overlaid with the
actual distribution in "X"+jet data in Fig. 17. The distribution
agrees well within the uncertainty envelope.

Because the $J_T^{30}$ uncertainties in each bin are correlated,
an independent data/background comparison in
each bin is not straightforward. Rather, we test the shape
agreement once using the (arbitrarily chosen) region of
$J_T^{30} > 200$ GeV. Above 200 GeV, 19.7+^ events are
expected and 20 events are observed.

The background extrapolation method can accurately
predict the normalization and shape of the $J_T^{30}$
distribution in the jet triggered sample. However, because of the
prescale, this sample has relatively low statistics despite
the large cross section of QCD multi-jet processes. To
obtain a higher statistics sample of multi-jet data, we
can use the electron triggers, which are not prescaled. In
this sample we construct an "X" by pairing the triggered
electron with a "fake" electron, which is an EM calorimeter
cluster that is reconstructed as an electron but fails
the low hadronic energy requirement. "X" events selected
in this way are dominated by QCD dijet events.
Again, $M_X > 70$ GeV/$c^2$ is required to remove the
invariant mass turn-on. Additionally the invariant mass
region $81 < M_X < 101$ GeV/$c^2$ is vetoed to remove real
$Z \to ee$ events. Figure 18 shows the invariant mass
distribution before these requirements.

Given this "X" selection, the remaining jets in the
event are used to validate the procedure. Figure 19 shows
the third highest $E_T$ jet distribution. We extrapolate this
distribution above 30 GeV using Eq. (1). A prediction of
4427+^0 (statistical uncertainty only) events with
$N_{\rm jet}^{30} \ge 3$ is obtained, and 4509 events are observed.
Approximating the Poisson distribution of the number of observed
events as a Gaussian, this is a $0.23\sigma$ level of consistency.

FIG. 14: $E_T$ distribution of jets in $N_{\rm jet}^{30} = 2$ "X"+jet events
selected with the jet triggers as described in the text. The
distribution is fit to Eq. (1) in the $E_T > 30$ GeV region with
the parameter $p_2$ fixed to that obtained from the fit in Fig. 13.

FIG. 15: The extrapolation of the exponential parameter $p_1$
vs. $N_{\rm jet}^{30}$ in "X"+jet events selected with the jet triggers as
described in the text.

FIG. 16: $N_{\rm jet}^{30}$ distribution in "X"+jet events selected with
the jet triggers as described in the text. The distribution is
fit to an exponential in the range $N_{\rm jet}^{30} \le 2$.

FIG. 17: The prediction (blue line) and uncertainty (gray
band) for the $J_T^{30}$ distribution of "X"+jet events selected with
the jet triggers as described in the text. The prediction is
compared to the actual distribution (black points with errors).
The observation agrees with the prediction.

The $J_T^{30}$ shape is predicted using the previously
described procedure of extrapolating the jet $E_T$
distributions from events with $N_{\rm jet}^{30} = 1$ and 2 to
$N_{\rm jet}^{30} \ge 3$. The normalized prediction and its
uncertainty are compared to the actual distribution in the data
in Fig. 20. The distribution agrees well within the uncertainty
envelope. Above 200 GeV, 1412^212 events are expected;
1128 events are observed, for a $-1.3\sigma$ level of consistency.
The background prediction is compared to the number of
observed events as a function of the $J_T^{30}$ cut in Table III.
The prediction agrees well over the entire $J_T^{30}$ distribution.

We have seen that the background extrapolation performs
well enough in this high-statistics validation sample.
Because of the high statistics, this sample can be
divided into subsamples to test the prediction method
many times over. The electron-triggered multi-jet data
are divided into 50 subsamples to check the background
estimation with a sample size similar to that expected in
the Z+jet data.
FIG. 18: Distribution of $M_X$ in "X"+jet events selected from
the electron triggers as described in the text. The shaded
regions are removed; that is, events with $M_X > 70$ GeV/$c^2$
are selected, and the $81 < M_X < 101$ GeV/$c^2$ region is vetoed.
FIG. 19: $E_T$ distribution of the third highest $E_T$ jet in
"X"+jet events selected with the electron triggers as
described in the text. The distribution is fit to Eq. (1) in the
$15 < E_T < 30$ GeV region and extrapolated to the $E_T > 30$
GeV region.



TABLE III: The "X"+jet data (selected with the electron
triggers as described in the text) vs. $J_T^{30}$, compared with the
background prediction.

FIG. 20: The prediction (blue line) and uncertainty (gray
band) for the $J_T^{30}$ distribution of "X"+jet events selected with
the electron triggers as described in the text. The prediction
is compared to the actual distribution (black points with
errors). The observation agrees with the prediction, with a
maximum fluctuation downward of $1.9\sigma$. The data are below
the prediction for several points because the shape uncertainty
is correlated between bins.

To validate the third highest $E_T$ jet extrapolation, we
evaluate the consistency between the fit prediction and
the observation in each subsample. The pull distribution
from these calculations is observed to be consistent with
a Gaussian with a mean of 0 and a width of 1, indicating that
the mean prediction and the uncertainties are correctly
calculated for the $N_{\rm jet}^{30} \ge 3$ prediction. On average, the
background prediction is 3 ± 5% low relative to the data.
That is, the background prediction underestimates the
background, but by an amount consistent with zero. This
is consistent with the fit done in standard model Z Monte
Carlo simulation in Sec. V A, in which the background
prediction was 31 ± 16% low relative to the data.
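
The pull test described above can be illustrated with a short sketch. The pull definition used here, (observed - predicted) divided by the uncertainty on the prediction, is an assumption, and the subsample numbers are hypothetical placeholders rather than values from the analysis.

```python
import math


def pulls(observed, predicted, sigma):
    """Per-subsample pull: (observed - predicted) / uncertainty on the prediction."""
    return [(o - p) / s for o, p, s in zip(observed, predicted, sigma)]


def mean_and_width(values):
    """Sample mean and unbiased standard deviation of the pull distribution."""
    n = len(values)
    mean = sum(values) / n
    width = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return mean, width


# Hypothetical numbers for four subsamples, for illustration only.
obs = [92, 85, 101, 88]
pred = [90.0, 90.0, 90.0, 90.0]
err = [10.0, 10.0, 10.0, 10.0]
mean, width = mean_and_width(pulls(obs, pred, err))
```

If the predictions are unbiased and their uncertainties are correctly estimated, the pulls from many subsamples should scatter around 0 with a width close to 1.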

To validate the $J_T^{30}$ shape prediction, in each subsample
we evaluate the consistency between the fit prediction
and the observation using a cut of $J_T^{30} > 200$ GeV.
In this case, the resulting pull distribution was inconsistent
with a Gaussian with a mean of 0 and a width of 1. We find
that the background prediction overestimates the number
of observed events, and that the uncertainty is overly
conservative, after correcting for this bias. On average,
the background prediction is 23 ± 7% high relative to the
data. However, we find that this bias is covered by the
uncertainties, with an average uncertainty on the background
prediction of 47%. To clarify, these biases are
only present in the $J_T^{30}$ shape prediction, and not in the
$N_{\rm jet}^{30} \ge 3$ prediction.

FIG. 21: The $J_T^{30}$ distribution without the $N_{\rm jet}^{30} \ge 3$
requirement in the Z+jet data (black line), compared to "X"+jet
data selected with the jet triggers (red histogram) and to
"X"+jet data selected with the electron triggers (dotted blue
line).

To compare the jet kinematics in each of the validation
samples (both the "X" events selected from jet triggers
and the "X" events selected from the electron triggers)
to the Z+jet data, the $J_T^{30}$ distribution of each is plotted,
without the $N_{\rm jet}^{30} \ge 3$ requirement, in Fig. 21. The
overall shapes are similar, with slight differences; for example,
the electron-triggered "X"+jet data have a harder spectrum.
However, the background estimation takes these differences
into account in the fit procedure.

These validations show that the fit prediction method 
correctly calculates the background when there is no sig- 
nal present. To verify that it calculates the background 
correctly in the presence of signal, we use W+jet data. 



2. W+jet Data 

The tree-level single-W diagrams and the physics that
gives rise to additional jets are similar to those of Z+jet
production, and so similar behavior in the W+jet data is
expected. However, in the W+jet data, in addition to
the single-W production there is also a heavy quark
signal from the top quark, producing W bosons via
$t\bar{t} \to WWb\bar{b}$. This sample provides a useful and
interesting validation of the method: it is a real data sample
that can test whether or not the background fit procedure
performs properly in the presence of a signal similar
to that of the search.

W events in the $W \to \mu\nu$ channel are selected by
requiring exactly one "tight" muon and missing transverse
energy ($\not\!\!E_T$). The $\not\!\!E_T$ is measured using the vector sum
of the calorimeter tower transverse energies and the muon
$p_T$; $\not\!\!E_T > 25$ GeV is required. Since only a single muon
is required, this is the so-called "lepton+jets" channel of
the top quark selected with only kinematic information,
and without tagging b-jets [31].

Using this W+jet selection, we test the extraction of
the top signal for events with $N_{\rm jet}^{30} \ge 3$ using only data
as a validation of the method for predicting the Z+jet
background. We expect standard model W+jet to be the
dominant background for $t\bar{t}$ after the $N_{\rm jet}^{30}$ requirement.
In single W+jet Monte Carlo simulation with no $t\bar{t}$
component, the method does predict the actual Monte Carlo
distribution well. We then apply the same method to the
W+jet data, fitting the third highest $E_T$ jet distribution
to Eq. (1) in Fig. 22. In this case, the extrapolation does
not describe the data well.

The extrapolation predicts 439^0 (stat.) ^4 (syst.)
events; 762 events are observed. We hypothesize that this
excess is due to the top quark, and test this by checking
that the cross section is consistent with that expected for
$t\bar{t}$. The excess of the data above the background gives
the number of $t\bar{t}$ candidates, 323±i^ (stat.) tH (syst.).
Using $t\bar{t}$ Monte Carlo events gives an estimate for the
product of acceptance and efficiency of 3.41 ± 0.02%. The
luminosity of the muon-triggered sample is 1.04 fb$^{-1}$. A
cross section of 9 ± 1 pb (stat. uncert. only) [30] is therefore
obtained. The proximity to the previously measured cross section
in this channel at CDF using 194 pb$^{-1}$, 6.6 ± 1.1 (stat.) ±
1.5 (syst.) pb [31], indicates that the excess is consistent
with the background+$t\bar{t}$ hypothesis, and that the fit
procedure is accurately predicting the background from
single W+jet production in the presence of signal.
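
The central value of this cross section follows from the numbers quoted above as $\sigma = N / (A\epsilon \cdot \mathcal{L})$. The short sketch below reproduces the arithmetic; the function name is hypothetical, and the uncertainties are not propagated.

```python
def cross_section_pb(n_signal, acceptance_times_efficiency, luminosity_pb_inv):
    """Cross section in pb from a signal yield, A x efficiency, and luminosity in pb^-1."""
    return n_signal / (acceptance_times_efficiency * luminosity_pb_inv)


n_excess = 762 - 439  # observed events minus the extrapolated background
xs = cross_section_pb(n_excess, 0.0341, 1040.0)  # 1.04 fb^-1 = 1040 pb^-1
print(f"{xs:.1f} pb")  # 9.1 pb
```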

A prediction is now made for the $J_T^{30}$ shape of the
W+jet background. Figures 23 and 24 show the fits to
the jet $E_T$ spectra for events with $N_{\rm jet}^{30} = 1$ and 2;
Fig. 25 shows the parameter $p_1$ extrapolation; Fig. 26 shows the
$N_{\rm jet}^{30}$ shape fit. We use these shapes to obtain the
$J_T^{30}$ shape and errors, add the expected contribution from
$t\bar{t}$ using Monte Carlo simulation (normalized to the
"measured" cross section of 9 pb), and compare this to the
actual distribution in data in Fig. 27. The observed data
are well described by the total $J_T^{30}$ prediction, verifying
that the fit procedure can predict the $J_T^{30}$ shape of the
background in the presence of signal.

While the predicted shape of the $J_T^{30}$ distribution
agrees well with the data (after adding the expected
contribution from $t\bar{t}$), the total uncertainty on the
background prediction becomes extremely large at high $J_T^{30}$.
The $J_T^{30}$ distribution for $t\bar{t}$ peaks near 200 GeV, where
the uncertainty is small, but it is instructive to understand
the reason for the increased uncertainty at very
large $J_T^{30}$. This large error is completely dominated by
a poor parameterization of the $E_T$ distribution of jets in
$N_{\rm jet}^{30} = 2$ events. Since the fitted parameterization
in Fig. 24 poorly describes the data, changing the range
from nominal (our method for determining the size of
the mis-parameterization uncertainty) makes a large
difference in the fit. However, this is not a problem with
the parameterization in Eq. (1): if the same spectrum is fit
without fixing the power law parameter to the
value observed in events with $N_{\rm jet}^{30} = 1$, the quite
reasonable fit shown in Fig. 28 is obtained. That is, the
parameterization still describes the $N_{\rm jet}^{30} = 2$ $E_T$
spectrum well, but our method of fixing the power law parameter
in this fit to that observed from the $N_{\rm jet}^{30} = 1$ $E_T$
spectrum does not describe well the behavior of the changing jet
$E_T$ distributions as a function of $N_{\rm jet}^{30}$ in this
sample. In the other validation samples in data and Monte Carlo
simulations, and particularly in the fits of the Z+jet data, we
find no such large systematic effect from a mis-parameterization
in the $N_{\rm jet}^{30} = 2$ $E_T$ distribution. This issue therefore
does not affect this analysis, but it suggests the background
prediction procedure could be enhanced with a
more sophisticated parameter extrapolation, perhaps by
extrapolating both parameters $p_1$ and $p_2$ simultaneously.

FIG. 22: $E_T$ distribution of the third highest $E_T$ jet in
W+jet events (black line and points). The distribution is
fit to Eq. (1) in the $15 < E_T < 30$ GeV region and
extrapolated to the $E_T > 30$ GeV region. The dotted green line
shows the contribution from $t\bar{t}$ at the "measured" cross
section of 9 pb. There is very little contribution from $t\bar{t}$ within
the fit region. The extrapolated distribution is inconsistent
with the background-only hypothesis, but consistent with the
background plus $t\bar{t}$ hypothesis.

FIG. 23: $E_T$ distribution of jets in $N_{\rm jet}^{30} = 1$ W+jet events.
The distribution is fit to Eq. (1) in the $E_T > 30$ GeV region.

FIG. 24: $E_T$ distribution of jets in $N_{\rm jet}^{30} = 2$ W+jet events.
The distribution is fit to Eq. (1) in the $E_T > 30$ GeV region
with the parameter $p_2$ fixed to that obtained from the fit in
Fig. 23.

FIG. 25: The extrapolation of the exponential parameter $p_1$
vs. $N_{\rm jet}^{30}$ in W+jet events.

FIG. 26: $N_{\rm jet}^{30}$ distribution in W+jet events. The distribution
is fit to an exponential in the range $N_{\rm jet}^{30} \le 2$.

FIG. 27: The prediction (cyan histogram) and uncertainty
(dotted lines) for the $J_T^{30}$ distribution of W+jet events. The
expectation from $t\bar{t}$ is added to the prediction. The data
(points with errors) agree with the background plus $t\bar{t}$
hypothesis.

FIG. 28: $E_T$ distribution of jets in $N_{\rm jet}^{30} = 2$ W+jet events.
The distribution is fit to Eq. (1) in the $E_T > 30$ GeV region
without fixing the parameter $p_2$.

3. Signal Injection Studies

The studies in data indicate the fit method adequately
predicts the background, without and with the presence
of signal. We would also like to understand at what point,
if any, signal contamination causes an unacceptably large
change to the background prediction. That is, we need to
verify that the background extrapolation does not "fit
away" the signal, as the jet $E_T$ distributions may be
substantially changed if there is a large amount of signal in
the fitted regions.

To study this effect we use standard model Z Monte
Carlo events with $b' \to bZ$ Monte Carlo events added at
a variety of signal masses. An equivalent luminosity of
1 fb$^{-1}$ of Monte Carlo events is used to understand the
effect with approximately the amount of statistics that is
present in the data. For this study BR($b' \to bZ$) = 100%
is assumed; reducing this branching ratio will only reduce
the effect of a signal bias.

For example, the predicted $J_T^{30}$ distributions, generated
with and without $m_{b'} = 200$ GeV/$c^2$ Monte Carlo
signal events added to the Z+jet background fit, are
shown in Fig. 29. The difference between the background
predictions with and without signal is small compared to
the actual number of Monte Carlo events, indicating that
signal does not bias the fit to a large degree at this mass
point.

As expected, as the b' mass increases the fit becomes
less biased by the presence of signal; as the b' mass
decreases, the fit becomes more biased. At a b' mass
of 150 GeV/$c^2$, we found an increase in signal bias, but
sensitivity to this mass point is still retained (at a
significance of $4.8\sigma$). At a b' mass of 100 GeV/$c^2$, however,
we found that the signal was completely fit away. We therefore
do not set limits below 150 GeV/$c^2$. We note that
this search is still sensitive to models with masses near
100 GeV/$c^2$, as long as the cross sections are sufficiently
small as to not bias the fit. In general, though, lower
masses produce more signal contamination than higher
masses, as both the cross sections are larger and the $E_T$
distributions have larger fractions within the fit regions.
Sensitivity to these lower masses could be increased by
lowering the $E_T$ thresholds and $N_{\rm jet}$ cuts, and applying
similar fit procedures with the altered selection.



FIG. 29: Prediction for the $J_T^{30}$ distribution in standard
model $Z \to \mu\mu$ events, with and without the presence of a
200 GeV/$c^2$ b' signal introduced. The difference between the
two predictions is small compared to the excess of signal at
large $J_T^{30}$.



E. Application of Technique to the Signal Sample

We now apply the fit technique to the combined
$Z \to ee$ and $Z \to \mu\mu$ data to predict the background
from Z+jet final states. The third highest $E_T$ jet
distribution is shown in Fig. 30 with events that have
$N_{\rm jet}^{30} \ge 3$ removed. We fit in the region
$15 < E_T < 30$ GeV, and extrapolate to the region $E_T > 30$ GeV.
We predict $72.2^{+9.8}_{-11.1}$ events with $N_{\rm jet}^{30} \ge 3$.

To obtain the $J_T^{30}$ shape of the Z+jet background, we
fit the jet $E_T$ distributions of events with $N_{\rm jet}^{30} = 1$ and
2, and linearly extrapolate the fit parameter $p_1$ to events
with $N_{\rm jet}^{30} \ge 3$. The fit to the $N_{\rm jet}^{30} = 1$ jet $E_T$
spectrum is shown in Fig. 31, the fit to the $N_{\rm jet}^{30} = 2$ jet
$E_T$ spectrum in Fig. 32, and the extrapolation of the fit
parameter in Fig. 33. The fit to the $N_{\rm jet}^{30}$ distribution in
the 0, 1, and 2 jet bins in Fig. 34 is used as an estimate
of the shape of the $N_{\rm jet}^{30}$ distribution in the 3 and higher
jet bins. With these ingredients, the simple Monte Carlo
program is used to obtain the expected $J_T^{30}$ shape, which
is then normalized to the prediction for the total number
of $N_{\rm jet}^{30} \ge 3$ background events, $72.2^{+9.8}_{-11.1}$. The
$J_T^{30}$ distribution prediction and its total
statistical+systematic uncertainty are shown in Fig. 35.



FIG. 30: $E_T$ distribution of the third highest $E_T$ jet in
$Z \to ee$ and $Z \to \mu\mu$ events with $N_{\rm jet}^{30} \le 2$. The
distribution is fit to Eq. (1) in the $15 < E_T < 30$ GeV region
and extrapolated to the $E_T > 30$ GeV region. Events with
$N_{\rm jet}^{30} \ge 3$ (equivalent to $E_T > 30$ GeV, the hatched
region) are removed from the distribution.
VI. REMAINING BACKGROUNDS



Having estimated the contribution from Z+jet
with the above technique, we now estimate the remaining
backgrounds listed in Sec. IV.

The second background, multi-jet fakes, has approximately
the same shape as the Z+jet background, and
is therefore included in the fit procedure. This shape
similarity is demonstrated when validating the procedure
using multi-jet data in Sec. V D 1 above. Since this
background is already included in the Z+jet background
estimate, no further determination of it is needed.

Nonetheless, its size is independently measured to confirm
that it is small relative to the Z+jet background. To
obtain an upper bound on the multi-jet background, the
sidebands of the $M_{\ell\ell}$ distribution for events with
$N_{\rm jet}^{30} \ge 3$ are used. We attribute all of the events in
the sidebands to multi-jet fakes, and interpolate from the
sidebands into the $81 < M_{\ell\ell} < 101$ GeV/$c^2$ region. Using this
method, less than 11 ± 2 events from multi-jet fakes are
predicted. The small size relative to the Z+jet background,
$72.2^{+9.8}_{-11.1}$, indicates that this background is
relatively unimportant.

FIG. 31: $E_T$ distribution of jets in $N_{\rm jet}^{30} = 1$ $Z \to ee$ and
$Z \to \mu\mu$ events. The distribution is fit to Eq. (1) in the
$E_T > 30$ GeV region.

While the third background, from multi-jet events occurring
simultaneously with cosmic rays, is also included
in the fit procedure as the jet $E_T$ spectra are similar to
the Z+jet background, its size is again independently
measured. This background is rejected using timing
information from the COT. That information is also used
to estimate this background using the number of events
rejected with the timing cut, combined with a measurement
of the rejection efficiency in a high-purity sample of cosmic
rays. We find a negligible background.

FIG. 32: $E_T$ distribution of jets in $N_{\rm jet}^{30} = 2$ $Z \to ee$ and
$Z \to \mu\mu$ events. The distribution is fit to Eq. (1) in the
$E_T > 30$ GeV region with the parameter $p_2$ fixed to that
obtained from the fit in Fig. 31.

FIG. 33: The extrapolation of the exponential parameter $p_1$
vs. $N_{\rm jet}^{30}$ in $Z \to ee$ and $Z \to \mu\mu$ events.

The remaining backgrounds are not included in the
fit procedure since they contain jets from the decays of
massive particles and so the jet $E_T$ spectra do not follow
the parameterization in Eq. (1). They can be estimated
with Monte Carlo simulations normalized to the
expected standard model cross sections. All remaining
backgrounds are negligible relative to the Z+jet background,
the largest being from WZ, with an estimated
contribution of 1.6 ± 0.1 events. Each of the background
contributions to the $N_{\rm jet}^{30} \ge 3$ region is summarized in
Table IV. As the backgrounds from WZ, ZZ, and $t\bar{t}$ are
negligible compared to the Z+jet background, they are
excluded in the background estimation vs. $J_T^{30}$.

FIG. 34: $N_{\rm jet}^{30}$ distribution in $Z \to ee$ and $Z \to \mu\mu$ events.
The distribution is fit to an exponential in the range
$N_{\rm jet}^{30} \le 2$.

FIG. 35: The prediction (blue line) and uncertainty (gray
band) for the $J_T^{30}$ distribution of $Z \to ee$ and $Z \to \mu\mu$ events.



VII. RESULTS 

We now compare the background prediction to the
observation in the Z+jet data. From the third highest
$E_T$ jet extrapolation, $75.3^{+9.8}_{-11.1}$ events with
$N_{\rm jet}^{30} \ge 3$ are predicted, and 80 events are observed.
In Fig. 36 the extrapolation is shown overlaid with the data. The
data agree well with the extrapolation. The predicted
$J_T^{30}$ distribution is compared to that observed in data in
Fig. 37. Again, the data agree with the prediction quite
well. The predicted and observed numbers of events
integrated above various $J_T^{30}$ cut values are listed in Table V.
We search for an excess above the prediction at each $J_T^{30}$
cut value. Even when ignoring the systematic uncertainties,
the maximum difference upward has a significance
of $+0.9\sigma$; the maximum difference downward has a
significance of $-1.4\sigma$.

Process            Background

Z+jet              $72.2^{+9.8}_{-11.1}$
Multi-jet fakes    < 11 ± 2 (included in Z+jet fit)
Cosmics            negligible
WZ                 1.6 ± 0.1
ZZ                 0.7 ± 0.1
$t\bar{t}$         0.8 ± 0.1

Total              $75.3^{+9.8}_{-11.1}$

TABLE IV: Summary of all backgrounds after selecting
events with $N_{\rm jet}^{30} \ge 3$, independent of $J_T^{30}$.

FIG. 36: $E_T$ distribution of the third highest $E_T$ jet in
$Z \to ee$ and $Z \to \mu\mu$ events. The fit from Fig. 30 is overlaid. The
fit extrapolation matches the distribution above 30 GeV well.

Given that there is no significant excess present in the 
data, a cross section limit is set using the fourth genera- 
tion model. At each b' mass, the counting experiment is 
evaluated with the requirement J|P > myc 2 . The limit 
is set at a 95% confidence level by integrating a likeli- 
hood obtained using a Bayesian technique that smears 
the Poisson-distributed background with Gaussian ac- 
ceptance and mean background uncertainties [32 . The 
background and its uncertainty are taken from the fit pre- 
diction (listed in Table [V}; the product of acceptance and 
efficiency is taken from Monte Carlo simulation, with cor- 
rection factors applied to match the observed efficiency of 
leptons in Z — ► £ £ data. The uncertainty on the product 
of acceptance and efficiency is 10%, with the dominant 
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FIG. 37: The jfP prediction and uncertainty from Fig. 35 



compared to the observed distribution (black points and er- 
rors) in Z — > ee and Z — > fifj, events with N??t > 3. The 
prediction agrees well with the data. 
TABLE V: The data compared to the Z+jet background fit
prediction vs. $J_T^{30}$.



source from a jet energy scale uncertainty of 6.7% [20],
the second largest from a luminosity uncertainty of
5.9%, and the remainder from Monte Carlo event statistics
and imperfect knowledge of lepton identification efficiencies
[16], parton distribution functions [33], and initial-
and final-state radiation.
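
The limit-setting procedure described above can be sketched numerically. The snippet below is only an illustration under stated assumptions, not the implementation of Ref. [32]: it assumes a flat prior on the signal yield, Gaussian nuisance parameters truncated at zero, and Monte Carlo marginalisation over them. All function names, the scan range, and the example counts are hypothetical. The luminosity (1.06 fb$^{-1}$) and the 1.61% acceptance are taken from the text and from Appendix A.

```python
import math
import random


def poisson_pmf(n: int, mu: float) -> float:
    """Poisson probability of observing n events for an expectation mu > 0."""
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))


def signal_yield_limit(n_obs, bkg, bkg_err, rel_acc_err,
                       cl=0.95, n_samples=500, n_grid=400, seed=1):
    """Bayesian upper limit on the signal yield s (flat prior on s >= 0)."""
    if bkg <= 0.0:
        raise ValueError("background mean must be positive")
    rng = random.Random(seed)
    # Sample the nuisance parameters once: the background mean and a
    # multiplicative acceptance factor, both Gaussian, truncated at zero.
    nuisances = []
    while len(nuisances) < n_samples:
        b = rng.gauss(bkg, bkg_err)
        a = rng.gauss(1.0, rel_acc_err)
        if b > 0.0 and a > 0.0:
            nuisances.append((b, a))

    # Scan range for s: a generous, ad hoc choice (assumption).
    s_max = n_obs + 10.0 * math.sqrt(n_obs + 1.0) + 10.0
    step = s_max / n_grid
    grid = [i * step for i in range(n_grid + 1)]

    # Likelihood marginalised over the nuisance samples at each s.
    like = [sum(poisson_pmf(n_obs, s * a + b) for b, a in nuisances) / n_samples
            for s in grid]

    # Trapezoid integration until the posterior fraction reaches cl.
    total = sum(0.5 * (like[i] + like[i + 1]) * step for i in range(n_grid))
    running = 0.0
    for i in range(n_grid):
        running += 0.5 * (like[i] + like[i + 1]) * step
        if running >= cl * total:
            return grid[i + 1]
    return s_max


def cross_section_limit_pb(s_limit, lumi_inv_pb, acc_times_eff):
    """Convert a yield limit into a cross-section limit in pb."""
    return s_limit / (lumi_inv_pb * acc_times_eff)


# Hypothetical counting experiment (counts are not from Table V).
s95 = signal_yield_limit(n_obs=4, bkg=5.0, bkg_err=1.0, rel_acc_err=0.10)
print(cross_section_limit_pb(s95, lumi_inv_pb=1060.0, acc_times_eff=0.0161))
```

With zero observed events and negligible uncertainties, this sketch should approach the familiar yield limit of about 3 events, which is a useful sanity check on the integration.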

The 95% confidence level cross section limit as a function
of mass is shown in Fig. 38. For models with different
acceptances, the limit can be reinterpreted by factoring
out the acceptance of the fourth generation model
(for these values, see Appendix A) and including the
acceptance of the model in question.
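
The reinterpretation described above amounts to scaling the limit by a ratio of acceptances, because a counting experiment constrains the product of cross section, luminosity, and acceptance. A minimal sketch follows. It assumes that both acceptances are defined with the same conventions (for example, the same treatment of branching ratios) and that the background prediction is unchanged. The function name and the second acceptance value are hypothetical.

```python
def rescale_limit(limit_bprime_pb: float, acc_bprime: float, acc_model: float) -> float:
    """Translate a cross-section limit derived with acc_bprime to another model."""
    if acc_bprime <= 0.0 or acc_model <= 0.0:
        raise ValueError("acceptances must be positive")
    # The limit on the event yield is fixed, so sigma scales as 1 / acceptance.
    return limit_bprime_pb * acc_bprime / acc_model


# Example: a model with twice the b' acceptance at 250 GeV/c^2 (1.61%, Appendix A)
# would have a cross-section limit half as large. The 1.0 pb input is a placeholder.
print(rescale_limit(1.0, acc_bprime=0.0161, acc_model=0.0322))
```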

To set a mass limit on the fourth generation model,
the $b'$ cross section is calculated at leading order using
PYTHIA, with the assumption that BR($b' \to bZ$) = 100%.
With this assumption, the observed mass limit is
$m_{b'} > 268$ GeV/$c^2$. The previous search for this model in the $bZ$
channel obtained a limit of $m_{b'} > 199$ GeV/$c^2$ [2], with a
selection tailored to the specific $b'$ model by tagging $b$-jets
using displaced vertices.
FIG. 38: Cross section limit vs. $b'$ mass, set at a confidence
level of 95%. In the acceptance calculation, BR($b' \to bZ$) =
$\beta$ = 100% was assumed. If $\beta < 100\%$, the acceptance would
scale by the factor $1 - (1 - \beta)^2$, since the $b'$ is produced in
pairs and only one of them is required to decay to a Z with
our selection. In addition, non-Z decays could change the
acceptance of the $N_{\rm jet}^{30} \geq 3$ cut.
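
The scaling factor quoted in the caption follows from simple counting. The short derivation below assumes that the two $b'$ quarks in a pair decay independently and that the selection efficiency is otherwise unchanged. The caption itself notes that the second assumption need not hold.

```latex
\begin{align}
P(\text{no } b' \to bZ \text{ decay}) &= (1-\beta)^2, \\
P(\text{at least one } b' \to bZ \text{ decay}) &= 1 - (1-\beta)^2 = 2\beta - \beta^2 .
\end{align}
% Example: \beta = 0.5 gives 1 - 0.25 = 0.75, so the acceptance would
% fall to 75% of its \beta = 1 value under these assumptions.
```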






APPENDIX A: ACCEPTANCE OF b' MODEL 

Table VI shows the acceptance times efficiency to select
$b' \to bZ$ events (assuming BR($b' \to bZ$) = 100%) after
the kinematic cuts. As these acceptances
include a factor from BR($Z \to \ell\ell$), they are at most
BR($Z \to ee$) + BR($Z \to \mu\mu$) = 6.7%.



$b'$ mass (GeV/$c^2$)    Acceptance (%)
150                  1.05
200                  1.44
250                  1.61
300                  1.66
350                  1.77



TABLE VI: Acceptances to select $b' \to bZ$ events versus mass,
after applying the $N_{\rm jet}^{30} \geq 3$ and $J_T^{30} > m_{b'} c^2$ requirements.
These include a factor from the branching ratio of $Z \to ee$
and $Z \to \mu\mu$. If this factor is removed, the acceptances range
from 8-14%. BR($b' \to bZ$) = 100% was assumed.
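
The caption does not spell out which factor is removed to obtain the 8-14% range. One reading that reproduces it, offered here only as an interpretation, is that each event contains two Z bosons (one from each $b'$) and that at least one of them must decay to $ee$ or $\mu\mu$. The sketch below checks that arithmetic with the Table VI values; the function and variable names are hypothetical. Dividing by the single-Z factor of 6.7% alone would instead give roughly 16-26%.

```python
# Per-Z branching ratio to ee or mumu, as quoted in Appendix A.
BR_Z_TO_EE_OR_MUMU = 0.067

# Acceptance times efficiency in percent, from Table VI.
ACCEPTANCE_PERCENT = {150: 1.05, 200: 1.44, 250: 1.61, 300: 1.66, 350: 1.77}


def at_least_one_leptonic_z(br: float, n_z: int = 2) -> float:
    """Probability that at least one of n_z Z bosons decays to ee or mumu."""
    return 1.0 - (1.0 - br) ** n_z


def acceptance_without_br(acc_percent: float, br_factor: float) -> float:
    """Divide out the branching-ratio factor from an acceptance."""
    return acc_percent / br_factor


# Assumption: two Z bosons per event, at least one decaying leptonically.
factor = at_least_one_leptonic_z(BR_Z_TO_EE_OR_MUMU)
corrected = {mass: acceptance_without_br(acc, factor)
             for mass, acc in ACCEPTANCE_PERCENT.items()}
for mass, acc in sorted(corrected.items()):
    print(mass, round(acc, 1))
```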



VIII. CONCLUSION 

We have presented the results of a search for new particles
decaying to Z bosons and jets. We developed and
validated a new technique to predict the dominant background
from the data alone. This technique complements
the phenomenology-based method of predicting backgrounds
via Monte Carlo calculations of higher-order matrix
elements and non-perturbative soft parton showers.
The technique presented here has the advantages of not
requiring careful tuning of phenomenological parameters
when comparing to data and not requiring the many
resource-consuming iterations of Monte Carlo detector
simulations. The speed with which it can be applied
makes it an attractive tool for calculating backgrounds
in jet-rich environments at future experiments, including
those at the Large Hadron Collider.

In the application of the technique to CDF Z+jet data,
no significant excess above background was seen. A cross
section limit was therefore set on a fourth generation
model as a function of mass. A mass limit of
$m_{b'} > 268$ GeV/$c^2$, using a leading-order $b'$ cross section
calculation with the assumption that BR($b' \to bZ$) = 100%,
was set at a 95% confidence level.